Skip to content

Conversation

@Dan-Flores
Copy link
Contributor

@Dan-Flores Dan-Flores commented Oct 14, 2025

This PR enables encoding a video to tensor by adding a VideoEncoder constructor that accepts an AVIOContextHolder and file format. It is instantiated with AVIOToTensorContext.

Testing:

  • test_video_encoder_round_trip is updated to test to_tensor method.
  • The changes in AVIOTensorContext.cpp enable certain formats, such as avi and flv to be encoded correctly.
    • Although we do not test these container formats in test_video_encoder_round_trip, without the changes, they would raise an error when attempting to decode the encoded frames, indicating an incorrectly encoded video.
    • The added max field allows encoders that write the header last (doing a backwards seek) to output a correctly encoded tensor.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 14, 2025
@Dan-Flores Dan-Flores marked this pull request as ready for review October 14, 2025 19:51
Copy link
Contributor

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @Dan-Flores, nice work ! Left a few comments and suggestions but this looks great

create_from_tensor,
encode_audio_to_file,
encode_video_to_file,
encode_video_to_tensor,
Copy link
Contributor

@NicolasHug NicolasHug Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add some more test:

These tests can be easier to write (and read!) when we have the public Python VideoEncoder class. So if you prefer this can be left as a TODO, but we should make sure to follow up on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a similar test to this PR, hopefully it is fairly readable.
I may move some tests to test_encoders.py once I push a PR with the python API, since we really want to test the VideoEncoder entrypoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parametrizing test_bad_input without a VideoEncoder class is fairly messy. I'll move this test to test_encoders.py and clean it up. For now, I've added the case where a bad format is passed in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, maybe we can just write a TODO to parametrize test_bad_input when we move it to test_encoders.py ?

test/test_ops.py Outdated
assert_close(s_frame, rt_frame, atol=atol, rtol=0)

@pytest.mark.parametrize(
"format", ("mov", "mp4", "avi", "mkv", "webm", "flv", "gif")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's mark the webm entry as slow (if it is!), see #968 (comment) for how to do that just for the webm entry, and not to the whole test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointer, I've added the mark. The webm tests are in fact very slow.

I suspect its because the encoding is done in GRB instead of YUV, thanks to avcodec_find_best_pix_fmt_of_list choosing the least lossy format, and the webm encoder supports GRB. We will probably want to change this for efficiency.

pytest --durations=25 -k "TestVideoEncoderOps"
============================================================ slowest 25 durations ============================================================
25.98s call     test/test_ops.py::TestVideoEncoderOps::test_against_to_file[webm]
18.30s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_against_ffmpeg_cli[webm]
13.53s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_round_trip[to_tensor-webm]
13.33s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_round_trip[to_file-webm]
3.67s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_against_ffmpeg_cli[gif]
3.65s call     test/test_ops.py::TestVideoEncoderOps::test_against_to_file[gif]
3.03s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_against_ffmpeg_cli[mkv]
2.78s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_against_ffmpeg_cli[mov]
2.71s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_against_ffmpeg_cli[mp4]
2.35s call     test/test_ops.py::TestVideoEncoderOps::test_against_to_file[mkv]
2.26s call     test/test_ops.py::TestVideoEncoderOps::test_against_to_file[mp4]
2.21s call     test/test_ops.py::TestVideoEncoderOps::test_against_to_file[mov]
2.19s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_round_trip[to_tensor-mp4]
2.09s call     test/test_ops.py::TestVideoEncoderOps::test_video_encoder_round_trip[to_file-mov]

Copy link
Contributor

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Dan-Flores ! Left 3 non-blocking comments (one above, two below)

LGTM

Comment on lines 1463 to 1480
torch.testing.assert_close(
self.decode(encoded_file).data, self.decode(encoded_tensor).data
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are uint8 so the default atol will be 0 anyway, but it's good to be explicit:

Suggested change
torch.testing.assert_close(
self.decode(encoded_file).data, self.decode(encoded_tensor).data
)
torch.testing.assert_close(
self.decode(encoded_file).data, self.decode(encoded_tensor).data, rtol=0, atol=0
)

create_from_tensor,
encode_audio_to_file,
encode_video_to_file,
encode_video_to_tensor,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, maybe we can just write a TODO to parametrize test_bad_input when we move it to test_encoders.py ?

@Dan-Flores Dan-Flores merged commit 2117716 into meta-pytorch:main Oct 17, 2025
58 checks passed
@Dan-Flores Dan-Flores deleted the video_to_tensor branch October 17, 2025 13:30
NicolasHug pushed a commit to NicolasHug/torchcodec that referenced this pull request Oct 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants